Cluster based bit vector mining algorithm for finding frequent itemsets in temporal databases

نویسندگان

  • M. Krishnamurthy
  • Arputharaj Kannan
  • Ramachandran Baskaran
  • M. Kavitha
چکیده

In this paper, we introduce an efficient algorithm using a new technique to find frequent itemsets from a huge set of itemsets called Cluster based Bit Vectors for Association Rule Mining (CBVAR). In this work, all the items in a transaction are converted into bits (0 or 1). A cluster is created by scanning the database only once. Then frequent 1-itemsets are extracted directly from the cluster table. Moreover, frequent k-itemsets, where k 2 are obtained by using Logical AND between the items in a cluster table. This approach reduces main memory requirement since it considers only a small cluster at a time and as scalable for any large size of database. The overall performance of this method is significantly better than that of the previously developed algorithms for effective decision making.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

DBV-Miner: A Dynamic Bit-Vector approach for fast mining frequent closed itemsets

Frequent closed itemsets (FCI) play an important role in pruning redundant rules fast. Therefore, a lot of algorithms for mining FCI have been developed. Algorithms based on vertical data formats have some advantages in that they require scan databases once and compute the support of itemsets fast. Recent years, BitTable (Dong & Han, 2007) and IndexBitTable (Song, Yang, & Xu, 2008) approaches h...

متن کامل

Data sanitization in association rule mining based on impact factor

Data sanitization is a process that is used to promote the sharing of transactional databases among organizations and businesses, it alleviates concerns for individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...

متن کامل

MINING FUZZY TEMPORAL ITEMSETS WITHIN VARIOUS TIME INTERVALS IN QUANTITATIVE DATASETS

This research aims at proposing a new method for discovering frequent temporal itemsets in continuous subsets of a dataset with quantitative transactions. It is important to note that although these temporal itemsets may have relatively high textit{support} or occurrence within particular time intervals, they do not necessarily get similar textit{support} across the whole dataset, which makes i...

متن کامل

Fast Vertical Mining Using Boolean Algebra

The vertical association rules mining algorithm is an efficient mining method, which makes use of support sets of frequent itemsets to calculate the support of candidate itemsets. It overcomes the disadvantage of scanning database many times like Apriori algorithm. In vertical mining, frequent itemsets can be represented as a set of bit vectors in memory, which enables for fast computation. The...

متن کامل

An Efficient Mining Algorithm by Bit Vector Table for Frequent Closed Itemsets

Mining frequent closed itemsets in data streams is an important task in stream data mining. In this paper, an efficient mining algorithm (denoted as EMAFCI) for frequent closed itemsets in data stream is proposed. The algorithm is based on the sliding window model, and uses a Bit Vector Table (denoted as BVTable) where the transactions and itemsets are recorded by the column and row vectors res...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011